First draft for a buffered writing of files #183

Merged: 17 commits, Jun 17, 2024

lesquoyb
Contributor

This is a work in progress: I'm still testing it and am not yet sure it actually performs better.
I tested it with the model below (thanks @ptaillandier), and so far it seems much faster, but more testing is needed to be sure:

/**
* Name: Writingfile
* Based on the internal skeleton template. 
* Author: patricktaillandier
* Tags: 
*/

model Writingfile

global {
	int nb_writers <- 1000;
	int nb_cycle <- 100;
	string output_file <- "output.csv";
	
	init {
		create writer number: nb_writers;
		save "cycle,name,rnd(1.0),rnd(1.0)" to: output_file format: "text";
		
	}
}

species writer {
	reflex save_result {
		save string(cycle) + ","+name +"," + rnd(1.0) + "," + rnd(1.0) to: output_file format: "text" rewrite: false;
		
		save [cycle, name,rnd(1.0),rnd(1.0)] to: output_file format: "csv"  rewrite: false;
	}
}

experiment test_xp type: batch repeat: 1 keep_simulations: false until:cycle = nb_cycle  {
	float seed <- 1.0;
	float t;
	init {
		 t <- gama.machine_time ;
	}
	reflex results {
		write "Simulation time (in s): " + ((gama.machine_time - t) / 1000) with_precision 2;
	}
}

lesquoyb added 2 commits May 23, 2024 15:00
- still in a testing phase: buffers are flushed at the end of the simulation for now; this needs to be parameterised
- adds ask methods taking a CharSequence in addition to String
- may need an operator to flush manually?

@codescene-delta-analysis codescene-delta-analysis bot left a comment

Code Health Quality Gates: OK

  • Declining Code Health: 1 findings(s) 🚩
  • Improving Code Health: 1 findings(s) ✅
  • Affected Hotspots: 1 files(s) 🔥

lesquoyb added 3 commits May 29, 2024 15:46
- stronger typing of owners (switching to SimulationAgent instead of Object)
- replacing OwnerWriteAsks class by a simple Map<SimulationAgent, StringBuilder>
- adding enum for the buffering strategies
- adding a different map to manage the cycle based buffering
- removing unused OwnerWriteAsksQueue  class
- adding different methods to flush according to a given strategy
- flushing is now strictly per owner; previously it would also flush the earlier writes of other owners to the same file
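
To make the per-owner flushing described in these commit notes concrete, here is a minimal Java sketch. It is not the code from this PR: the class name `OwnerBufferSketch`, the method names, and the in-memory `files` map (which stands in for the file system) are all hypothetical. It only illustrates buffering text per file and per owner, then flushing a single owner without touching what other owners have pending for the same file.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: buffers text per file and per owner, and flushes one owner at a time. */
final class OwnerBufferSketch<O> {
    // file path -> (owner -> pending text); illustrative names, not GAMA's actual fields
    private final Map<String, Map<O, StringBuilder>> pending = new HashMap<>();
    // stands in for the real file system so the sketch stays self-contained
    final Map<String, StringBuilder> files = new LinkedHashMap<>();

    /** Queues content for a file on behalf of an owner; nothing is written yet. */
    void ask(String path, O owner, CharSequence content) {
        pending.computeIfAbsent(path, p -> new LinkedHashMap<>())
               .computeIfAbsent(owner, o -> new StringBuilder())
               .append(content);
    }

    /** Writes only this owner's pending text; other owners' buffers are left untouched. */
    void flushOwner(O owner) {
        for (Map.Entry<String, Map<O, StringBuilder>> e : pending.entrySet()) {
            StringBuilder buffered = e.getValue().remove(owner);
            if (buffered != null) {
                files.computeIfAbsent(e.getKey(), p -> new StringBuilder()).append(buffered);
            }
        }
    }
}
```

With two owners queuing writes for the same file, flushing the first one leaves the second owner's text buffered until its own flush.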

@codescene-delta-analysis codescene-delta-analysis bot left a comment

Code Health Quality Gates: FAILED

  • Declining Code Health: 1 findings(s) 🚩
  • Improving Code Health: 4 findings(s) ✅
  • Affected Hotspots: 1 files(s) 🔥

lesquoyb added 2 commits May 29, 2024 22:55
- adds a parameter to the ISaveStatement
- makes the flush operation continue even if it fails on one file
- minor cleanup
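
One way to keep a flush going after a failure on one file is to catch the exception per file rather than around the whole loop. The sketch below assumes a hypothetical `Sink` interface standing in for the code that actually writes to disk; `ResilientFlushSketch` and `flushAll` are illustrative names, not the PR's API.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ResilientFlushSketch {
    /** Hypothetical stand-in for the code that actually writes to disk. */
    interface Sink {
        void write(String path, CharSequence text) throws IOException;
    }

    /** Flushes every buffer and returns the paths whose write failed. */
    static List<String> flushAll(Map<String, StringBuilder> buffers, Sink sink) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, StringBuilder> e : buffers.entrySet()) {
            try {
                sink.write(e.getKey(), e.getValue());
            } catch (IOException ex) {
                // record the failure and carry on with the remaining files
                failed.add(e.getKey());
            }
        }
        buffers.clear();
        return failed;
    }
}
```

Returning the failed paths, rather than a single boolean, lets the caller report exactly which files could not be written. Whether failed buffers should be dropped (as here) or kept for a retry is a design choice this sketch does not settle.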

@codescene-delta-analysis codescene-delta-analysis bot left a comment

Code Health Quality Gates: FAILED

  • Declining Code Health: 6 findings(s) 🚩
  • Improving Code Health: 5 findings(s) ✅
  • Affected Hotspots: 1 files(s) 🔥

@lesquoyb
Contributor Author

lesquoyb commented May 29, 2024

I've finished implementing file buffering in the csv and text delegates of the save statement.
Users can now pick a buffering strategy with the buffering facet of the save statement.
Available strategies are:

  • no_buffering: the old behaviour; content is written to the file as soon as the save statement is reached
  • per_cycle: content is buffered and written to the file at the end of the cycle
  • per_simulation: content is buffered and written to the file when the simulation is closed

Within the same simulation you can mix save calls that use these three strategies; each call is resolved individually.
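
As a rough illustration of how the three strategies interact, here is a small Java sketch. It is not the PR's implementation: `BufferingStrategySketch`, its enum, and the list standing in for the output file are hypothetical, and the choice to write pending per-cycle content before per-simulation content on close is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class BufferingStrategySketch {
    enum Strategy { NO_BUFFERING, PER_CYCLE, PER_SIMULATION }

    private final List<String> file = new ArrayList<>();             // stands in for the output file
    private final List<String> cycleBuffer = new ArrayList<>();      // written at the end of each cycle
    private final List<String> simulationBuffer = new ArrayList<>(); // written when the simulation closes

    /** Routes one line according to the strategy chosen for that save call. */
    void save(String line, Strategy strategy) {
        switch (strategy) {
            case NO_BUFFERING:
                file.add(line); // written immediately
                break;
            case PER_CYCLE:
                cycleBuffer.add(line);
                break;
            case PER_SIMULATION:
                simulationBuffer.add(line);
                break;
        }
    }

    void endOfCycle() {
        file.addAll(cycleBuffer);
        cycleBuffer.clear();
    }

    void closeSimulation() {
        endOfCycle(); // assumption: pending per-cycle content is written first
        file.addAll(simulationBuffer);
        simulationBuffer.clear();
    }

    List<String> fileContent() {
        return file;
    }
}
```

In this sketch an unbuffered line lands in the file before a per-cycle line queued earlier in the same cycle, and per-simulation lines only appear after the simulation is closed.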

Here is an example model to showcase the different behaviors:


model buffering

global {
	
	
	reflex at_cycle {
		save "at cycle " + cycle to:"data.csv" header:false rewrite:false buffering:"per_cycle";
	}
	reflex at_cycle2 {
		save "at cycle2 " + cycle + " should appear after 'at cycle " + cycle + "' as it's asked in that order" to:"data.csv" header:false rewrite:false buffering:"per_cycle";
	}
	reflex no_buffering {
		save "at cycle " + cycle + " too, should appear before all the other, as it's executed right when the code is reached" rewrite:false to:"data.csv" header:false buffering:"no_buffering";
	}
	reflex end_of_simulation {
		save "Run at cycle " + cycle + " but should be appended at the end of the file" to:"data.csv" header:false rewrite:false buffering:"per_simulation";
	}
	
}

experiment a type:batch until:cycle=10 autorun:true{

}

Here is what's left to do:

  • add a default buffering strategy in the settings
  • add the buffering strategies in the save of the GamaFile objects
  • handle the rewrite case (seems to work right now, but not fully tested)
  • handle formats other than text and csv
  • cache the file objects? At the very least, stop opening the file object in the save statement, since it is recreated in the WriteController. The rewrite case needs particular attention
  • parallelize the flushing of files for one owner (must either wait for all of them before leaving the function or implement a lock, because two simulations can write to the same files)
  • add operators to manually flush a file or the whole simulation
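
For the parallel flushing item, the two options mentioned (waiting for all writes, or locking) can also be combined. The sketch below is only one possible shape, under assumptions: `ParallelFlushSketch` is a hypothetical class, files are simulated by an in-memory map, and a fixed pool of four threads is an arbitrary choice. Each file gets its own lock so two simulations cannot interleave writes to it, and `invokeAll` makes the method return only once every write has finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

final class ParallelFlushSketch {
    // one lock per file path, shared by every simulation flushing through this object
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    // stands in for the real files so the sketch stays self-contained
    final Map<String, StringBuilder> files = new ConcurrentHashMap<>();

    private void writeLocked(String path, CharSequence text) {
        ReentrantLock lock = locks.computeIfAbsent(path, p -> new ReentrantLock());
        lock.lock();
        try {
            files.computeIfAbsent(path, p -> new StringBuilder()).append(text);
        } finally {
            lock.unlock();
        }
    }

    /** Flushes all of one owner's buffers in parallel and returns once every write is done. */
    void flushOwner(Map<String, StringBuilder> ownerBuffers) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            ownerBuffers.forEach((path, text) -> tasks.add(() -> {
                writeLocked(path, text);
                return null;
            }));
            pool.invokeAll(tasks); // blocks until all tasks have completed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
    }
}
```

Creating a pool on every flush keeps the sketch short; a real implementation would more likely reuse a shared executor.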

@codescene-delta-analysis codescene-delta-analysis bot left a comment

Code Health Quality Gates: FAILED

  • Declining Code Health: 7 findings(s) 🚩
  • Improving Code Health: 5 findings(s) ✅
  • Affected Hotspots: 1 files(s) 🔥

lesquoyb added 3 commits May 30, 2024 14:03
- adds a SaveOptions class to carry all saving options
- buffered writing now manages the charset
- buffered writing now handles the rewrite facet
- adds a WriteTask class to carry information about encoding and rewriting in buffered writing
- adds a rewrite option to the json writer (it was in append mode by default)
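
As an illustration of what a task object carrying encoding and rewrite information might look like, here is a hedged Java sketch. `WriteTaskSketch` and its members are hypothetical and not the PR's WriteTask; the mapping of rewrite to truncate and of non-rewrite to append is an assumption based on the facet's description.

```java
import java.nio.charset.Charset;
import java.nio.file.OpenOption;
import java.nio.file.StandardOpenOption;

final class WriteTaskSketch {
    final StringBuilder content = new StringBuilder();
    final Charset charset;  // encoding used when the buffer is finally written
    final boolean rewrite;  // true: replace the file; false: append to it

    WriteTaskSketch(Charset charset, boolean rewrite) {
        this.charset = charset;
        this.rewrite = rewrite;
    }

    /** Encodes the buffered text with the charset carried by the task. */
    byte[] encodedBytes() {
        return content.toString().getBytes(charset);
    }

    /** Open options matching the rewrite flag (assumed mapping). */
    OpenOption[] openOptions() {
        return rewrite
                ? new OpenOption[] { StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE }
                : new OpenOption[] { StandardOpenOption.CREATE, StandardOpenOption.APPEND };
    }
}
```

At flush time such a task could be handed to the standard library with `Files.write(path, task.encodedBytes(), task.openOptions())`.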
@lesquoyb lesquoyb marked this pull request as ready for review May 30, 2024 10:56
@lesquoyb
Contributor Author

lesquoyb commented May 30, 2024

I have a functional first version. @AlexisDrogoul, could you have a look and tell me if you see anything wrong?
I think WriteController.java is in the wrong package, but I'm not sure where it should go.

Here is a model to test the functionality:

/**
* Name: buffered
* Based on the internal empty template. 
* Author: baptiste
* Tags: 
*/



model buffering

global {
	
	
	reflex at_cycle {
		save "at cycle " + cycle to:"data.csv" header:false rewrite:false buffering:"per_cycle";
	}
	reflex at_cycle2 {
		save "at cycle2 " + cycle + " should appear after 'at cycle " + cycle + "' as it's asked in that order" to:"data.csv" header:false rewrite:false buffering:"per_cycle";
	}
	reflex no_buffering {
		save "at cycle " + cycle + " too, should appear before all the other, as it's executed right when the code is reached" rewrite:false to:"data.csv" header:false buffering:"no_buffering";
	}
	reflex end_of_simulation {
		save "Run at cycle " + cycle + " but should be appended at the end of the file" to:"data.csv" header:false rewrite:false buffering:"per_simulation";
	}
	
	reflex combo_breaker when:cycle=5{
		let s <- flush_all_files(simulation);
	}
	
}

experiment a type:batch until:cycle=10 autorun:true{

}

@lesquoyb lesquoyb requested a review from AlexisDrogoul May 31, 2024 02:16

@codescene-delta-analysis codescene-delta-analysis bot left a comment

Code Health Quality Gates: OK

  • Declining Code Health: 3 findings(s) 🚩
  • Improving Code Health: 20 findings(s) ✅
  • Affected Hotspots: 1 files(s) 🔥

@@ -77,24 +75,68 @@ public void save(final IScope scope, final IExpression item, final OutputStream
 	 * Signals that an I/O exception has occurred.
 	 */
 	@Override
-	public void save(final IScope scope, final IExpression item, final File file, final String code,
-			final boolean addHeader, final String type, final Object attributesToSave)
+	public void save(final IScope scope, final IExpression item, final File file, final SaveOptions saveOptions)

✅ Getting better: Complex Method
save decreases in cyclomatic complexity from 24 to 20, threshold = 9

Comment on lines +252 to +253
final IExpression bufferingStrategy = desc.getFacetExpr(IKeyword.BUFFERING);

ℹ Getting worse: Complex Method
SaveStatement.SaveValidator.validate increases in cyclomatic complexity from 50 to 51, threshold = 9

@@ -58,18 +59,20 @@ public class ImageSaver extends AbstractSaver {
 	 * Signals that an I/O exception has occurred.
 	 */
 	@Override
-	public void save(final IScope scope, final IExpression item, final File file, final String code,
-			final boolean addHeader, final String type, final Object attributesToSave) throws IOException {
+	public void save(final IScope scope, final IExpression item, final File file, final SaveOptions options) throws IOException {

❌ Getting worse: Complex Method
save increases in cyclomatic complexity from 12 to 13, threshold = 9

Comment on lines +18 to +27
	public SaveOptions(final String code, final boolean addHeader, final String type, final Object attributesToSave,
			BufferingStrategies bufferingStrategy, final boolean rewrite) {
		this.code = code;
		this.addHeader = addHeader;
		this.type = type;
		this.attributesToSave = attributesToSave;
		this.bufferingStrategy = bufferingStrategy;
		this.rewrite = rewrite;
		writeCharset = StandardCharsets.UTF_8;
	}

❌ New issue: Constructor Over-Injection
SaveOptions has 6 arguments, threshold = 5

gama.core/src/gama/gaml/operators/Files.java (two resolved review threads)

- renames WriteController to BufferingController and WriteTask to BufferingTask
- makes BufferingTask a static nested class
- adds maps for buffered write statements in BufferingController
- adds base functions to handle buffered write statements in BufferingController
- some light refactoring
- fixes the documentation of flush_all_files

@codescene-delta-analysis codescene-delta-analysis bot left a comment

Code Health Quality Gates: OK

  • Declining Code Health: 3 findings(s) 🚩
  • Improving Code Health: 20 findings(s) ✅
  • Affected Hotspots: 1 files(s) 🔥

}

// If the last element of the list is not of the same color as the currently requested color we append a new task with the new color
if (requests.size() == 0 || (requests.get(requests.size()-1).color != null && !requests.get(requests.size()-1).color.equals(color))) {

❌ New issue: Complex Conditional
appendWriteConsoleRequestToMap has 1 complex conditionals with 2 branches, threshold = 2
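
One common way to address a "complex conditional" finding is to give the condition a name. The sketch below is hypothetical and self-contained: `ConsoleRequestSketch`, `Request`, and the use of a plain String for the color are assumptions, and only the test inside `startsNewRun` mirrors the reviewed line. Like the reviewed condition, it does not start a new run when the last request has a null color, even if the requested color is non-null; whether that is intended is not stated in this thread.

```java
import java.util.List;

final class ConsoleRequestSketch {
    /** Hypothetical request type: a run of text sharing one color (null = default color). */
    static final class Request {
        final String color;
        final StringBuilder text = new StringBuilder();

        Request(String color) {
            this.color = color;
        }
    }

    /** Same test as the reviewed condition, extracted into a named method. */
    static boolean startsNewRun(List<Request> requests, String color) {
        if (requests.isEmpty()) {
            return true;
        }
        String last = requests.get(requests.size() - 1).color;
        return last != null && !last.equals(color);
    }

    /** Appends text to the last run, opening a new run first when the color changes. */
    static void append(List<Request> requests, String color, CharSequence text) {
        if (startsNewRun(requests, color)) {
            requests.add(new Request(color));
        }
        requests.get(requests.size() - 1).text.append(text);
    }
}
```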

…statement

- moves part of the buffering strategies code from SaveStatement to BufferingController. It should probably be moved to a separate class when refactoring
- some refactoring/renaming
- adds the per_agent strategy to the save and write statements. The data is written when the calling agent is disposed. Need to check that this does not hurt performance in simulations with many agents
- appends a Strings.LN in each buffering call. This should later be removed and done directly in the write statement
- some formatting on Default Scheduler.gaml because why not

@codescene-delta-analysis codescene-delta-analysis bot left a comment

Code Health Quality Gates: OK

  • Declining Code Health: 4 findings(s) 🚩
  • Improving Code Health: 20 findings(s) ✅
  • Affected Hotspots: 1 files(s) 🔥

lesquoyb added 3 commits June 11, 2024 11:43
- minor fixes for spotbugs
- fixes removing last characters in case of no buffering (forgotten code from the addition of end facet)
- better names
- documents some methods

@codescene-delta-analysis codescene-delta-analysis bot left a comment

Code Health Quality Gates: OK

  • Declining Code Health: 3 findings(s) 🚩
  • Improving Code Health: 20 findings(s) ✅
  • Affected Hotspots: 1 files(s) 🔥

@lesquoyb
Contributor Author

I'm merging this as it seems to work; I've opened separate issues for the remaining improvements described above.

@lesquoyb lesquoyb merged commit 1670998 into 2024-06 Jun 17, 2024
6 checks passed